GLYCOSIDASE ENZYMES
BACKGROUND OF THE INVENTION 

1. Field of the Invention

This invention relates to newly identified polynucleotides, polypeptides encoded by such
polynucleotides, the use of such polynucleotides and polypeptides, as well as the
production and isolation of such polynucleotides and polypeptides. More particularly,
the polynucleotides and polypeptides of the present invention have been putatively
identified as glucosidases, α-galactosidases, β-galactosidases, β-mannosidases,
β-mannanases, endoglucanases, and pullulanases.

2. Description of Related Art

The glycosidic bond of β-galactosides can be cleaved by different classes of enzymes:
(i) phospho-β-galactosidases (EC 3.2.1.85), which are specific for a phosphorylated substrate
generated via phosphoenolpyruvate phosphotransferase system (PTS)-dependent uptake;
(ii) typical β-galactosidases (EC 3.2.1.23), represented by the Escherichia coli LacZ
enzyme, which are relatively specific for β-galactosides; and (iii) β-glucosidases (EC
3.2.1.21) such as the enzymes of Agrobacterium faecalis, Clostridium thermocellum,
Pyrococcus furiosus or Sulfolobus solfataricus (Day, A.G. and Withers, S.G. (1986)
Purification and characterization of a β-glucosidase from Alcaligenes faecalis. Can. J.
Biochem. Cell. Biol. 64, 914-922; Kengen, S.W.M. et al. (1993) Eur. J. Biochem. 213,
305-312; Ait, N., Creuzet, N. and Cattaneo, J. (1982) Properties of β-glucosidase purified
from Clostridium thermocellum. J. Gen. Microbiol. 128, 569-577; Grogan, D.W. (1991)
Evidence that β-galactosidase of Sulfolobus solfataricus is only one of several activities
of a thermostable β-D-glycosidase. Appl. Environ. Microbiol. 57, 1644-1649). Members
of the latter group, although highly specific with respect to the β-anomeric configuration
of the glycosidic linkage, often display a rather relaxed substrate specificity and
hydrolyze β-glucosides as well as β-fucosides and β-galactosides.

Generally, α-galactosidases are enzymes that catalyze the hydrolysis of galactose groups
on a polysaccharide backbone or the cleavage of di- or oligosaccharides
comprising galactose.

Generally, β-mannanases are enzymes that catalyze the hydrolysis of mannose groups
internally on a polysaccharide backbone or the cleavage of di- or
oligosaccharides comprising mannose groups. β-Mannosidases hydrolyze non-reducing,
terminal mannose residues on a mannose-containing polysaccharide and cleave
di- or oligosaccharides comprising mannose groups.

Guar gum is a branched galactomannan polysaccharide composed of a β-1,4 linked
mannose backbone with α-1,6 linked galactose side chains. The enzymes required for
the degradation of guar are β-mannanase, β-mannosidase and α-galactosidase.
β-Mannanase hydrolyses the mannose backbone internally, β-mannosidase hydrolyses
non-reducing, terminal mannose residues, and α-galactosidase hydrolyses α-linked galactose
groups.

Galactomannan polysaccharides and the enzymes that degrade them have a variety of
applications. Guar is commonly used as a thickening agent in food and is utilized in
hydraulic fracturing in oil and gas recovery. Consequently, galactomannanases are
industrially relevant for the degradation and modification of guar. Furthermore, a need
exists for thermostable galactomannanases that are active in the extreme conditions associated
with drilling and well stimulation.

There are other applications for these enzymes in various industries, such as the beet
sugar industry. Sucrose from sugar beets accounts for 20-30% of domestic U.S. sucrose
consumption. Raw beet sugar can contain a small amount of raffinose when the sugar beets are
stored before processing and rotting begins to set in. Raffinose inhibits the
crystallization of sucrose and also constitutes a hidden quantity of sucrose. Thus, there
is merit to eliminating raffinose from raw beet sugar. α-Galactosidase has also been used
as a digestive aid to break down raffinose, stachyose, and verbascose in such foods as
beans and other gassy foods.

β-Galactosidases which are active and stable at high temperatures appear to be superior
enzymes for the production of lactose-free dietary milk products (Chaplin, M.F. and
Bucke, C. (1990) In: Enzyme Technology, pp. 159-160, Cambridge University Press,
Cambridge, UK). Also, several studies have demonstrated the applicability of
β-galactosidases to the enzymatic synthesis of oligosaccharides via transglycosylation
reactions (Nilsson, K.G.I. (1988) Enzymatic synthesis of oligosaccharides. Trends
Biotechnol. 6, 256-264; Cote, G.L. and Tao, B.Y. (1990) Oligosaccharide synthesis by
enzymatic transglycosylation. Glycoconjugate J. 7, 145-162). Despite the commercial
potential, only a few β-galactosidases of thermophiles have been characterized so far.
Two reported genes encode β-galactoside-cleaving enzymes of the hyperthermophilic
bacterium Thermotoga maritima, one of the most thermophilic organotrophic eubacteria
described to date (Huber, R., Langworthy, T.A., König, H., Thomm, M., Woese, C.R.,
Sleytr, U.B. and Stetter, K.O. (1986) T. maritima sp. nov. represents a new genus of
unique extremely thermophilic eubacteria growing up to 90 °C. Arch. Microbiol. 144,
324-333). The gene products have been identified as a β-galactosidase and a β-glucosidase.

Pullulanase is well known as a debranching enzyme of pullulan and starch. The enzyme
hydrolyzes α-1,6-glucosidic linkages in these polymers. Starch degradation for the
production of sweeteners (glucose or maltose) is a very important industrial application
of this enzyme. The degradation of starch proceeds in two stages. The first stage
involves the liquefaction of the substrate with α-amylase, and the second stage, or
saccharification stage, is performed by β-amylase with pullulanase added as a
debranching enzyme to obtain better yields.

Endoglucanases can be used in a variety of industrial applications. For instance, the
endoglucanases of the present invention can hydrolyze the internal β-1,4-glycosidic
bonds in cellulose, which may be used for the conversion of plant biomass into fuels and
chemicals. Endoglucanases also have applications in detergent formulations, the textile
industry, in animal feed, in waste treatment, and in the fruit juice and brewing industry
for the clarification and extraction of juices.

WO 98/24799    PCT/US97/22623

Brief Description of the Drawings

The following drawings are illustrative of embodiments of the invention and are not 
meant to limit the scope of the invention as encompassed by the claims. 

Figures 1a-b are the full-length DNA and corresponding deduced amino acid sequence
of M11TL of the present invention. Sequencing was performed using a 378 automated
DNA sequencer for all sequences of the present invention (Applied Biosystems, Inc.).

Figure 2 is an illustration of the full-length DNA and corresponding deduced amino acid
sequence of OC1/4V-33B/G.

Figure 3 is an illustration of the full-length DNA and corresponding deduced amino acid
sequence of F1-12G.

Figures 4a-b are the full-length DNA and corresponding deduced amino acid sequence
of 9N2-31B/G.

Figures 5a-b are the full-length DNA and corresponding deduced amino acid sequence
of MSB8-6G.

Figure 6 is the full-length DNA and corresponding deduced amino acid sequence of
AEDII12RA-18B/G.

Figures 7a-b are the full-length DNA and corresponding deduced amino acid sequence
of GC74-22G.

Figures 8a-b are the full-length DNA and corresponding deduced amino acid sequence
of VC1-7G1.

Figures 9a-c are the full-length DNA and corresponding deduced amino acid sequence
of 37GP1.

Figures 10a-c are the full-length DNA and corresponding deduced amino acid sequence
of 6GC2.

Figures 11a-d are the full-length DNA and corresponding deduced amino acid sequence
of 6GP2.

Figures 12a-c are the full-length DNA and corresponding deduced amino acid sequence
of 63GB1.

Figures 13a-b are the full-length DNA and corresponding deduced amino acid sequence
of OC1/4V.

Figures 14a-e are the full-length DNA and corresponding deduced amino acid sequence
of 6GP3.

Figures 15a-d are the full-length DNA and corresponding deduced amino acid sequence
of Thermotoga maritima MSB8-6GP2.

Figures 16a-c are the full-length DNA and corresponding deduced amino acid sequence
of Thermotoga maritima MSB8-6GB4.

Figures 17a-d are the full-length DNA and corresponding deduced amino acid sequence
of Bankia gouldi 37GV4.

Figures 18a-b are the full-length DNA and corresponding deduced amino acid sequence
of Pyrococcus furiosus VC1-7EG1.

SUMMARY OF THE INVENTION

In a preferred embodiment of the present invention, there are provided isolated nucleic 
acids (polynucleotides) which encode mature enzymes having the deduced amino acid 
sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64). 

In another embodiment, the invention provides a method for producing a polypeptide
including culturing host cells containing the polynucleotide of Figures 1-18,
expressing from the host cells a polypeptide encoded by the polynucleotide, and isolating
the polypeptide.

In another embodiment, the invention provides an enzyme selected from the group
consisting of an enzyme having an amino acid sequence set forth in SEQ ID NOS: 15-28
or 61-64 and an enzyme which has at least 30 consecutive amino acid residues of an
enzyme having an amino acid sequence set forth in SEQ ID NOS: 15-28 or 61-64.

In yet another embodiment, the invention provides a method for generating glucose from
soluble cell oligosaccharides which includes contacting a sample containing
oligosaccharides with an effective amount of an enzyme selected from the group of
enzymes having the amino acid sequence set forth in SEQ ID NOS: 15-28, 61-63 and 64
such that glucose is produced.

The publications discussed herein are provided solely for their disclosure prior to the
filing date of the present application. Nothing herein is to be construed as an admission
that the invention is not entitled to antedate such disclosure by virtue of prior invention.



Definitions

"Monosaccharide", as used herein, refers to a single polyhydroxy aldehyde or ketone
unit.

"Oligosaccharide", as used herein, refers to short chains of monosaccharide units joined
together by covalent bonds. Of these, the most abundant are the disaccharides, which
have two monosaccharide units.

"Polysaccharide", as used herein, refers to long chains having many monosaccharide
units.

The term "gene" means the segment of DNA involved in producing a polypeptide chain; 
it includes regions preceding and following the coding region (leader and trailer) as well 
as intervening sequences (introns) between individual coding segments (exons). 

A coding sequence is "operably linked to" another coding sequence when RNA
polymerase will transcribe the two coding sequences into a single mRNA, which is then
translated into a single polypeptide having amino acids derived from both coding
sequences. The coding sequences need not be contiguous to one another so long as the
expressed sequences are ultimately processed to produce the desired protein.

"Recombinant" enzymes refer to enzymes produced by recombinant DNA techniques,
i.e., produced from cells transformed by an exogenous DNA construct encoding the
desired enzyme. "Synthetic" enzymes are those prepared by chemical synthesis.

A DNA "coding sequence of" or a "nucleotide sequence encoding" a particular enzyme
is a DNA sequence which is transcribed and translated into an enzyme when placed
under the control of appropriate regulatory sequences.

Detailed Description of the Invention

The polynucleotides and polypeptides of the present invention have been identified as
glucosidases, α-galactosidases, β-galactosidases, β-mannosidases, β-mannanases,
endoglucanases, and pullulanases as a result of their enzymatic activity.

In accordance with one aspect of the present invention, there are provided novel
enzymes, as well as active fragments, analogs and derivatives thereof.

In accordance with another aspect of the present invention, there are provided isolated
nucleic acid molecules encoding the enzymes of the present invention including mRNAs,
cDNAs, genomic DNAs as well as active analogs and fragments of such enzymes.

In accordance with yet a further aspect of the present invention, there is provided a
process for producing such polypeptides by recombinant techniques comprising culturing
recombinant prokaryotic and/or eukaryotic host cells, containing a nucleic acid sequence
of the present invention, under conditions promoting expression of said enzymes and
subsequent recovery of said enzymes.

In accordance with yet a further aspect of the present invention, there is provided a
process for utilizing such enzymes, or polynucleotides encoding such enzymes, for
hydrolyzing lactose to galactose and glucose for use in the food processing industry, in the
pharmaceutical industry (for example, to treat intolerance to lactose), as a diagnostic
reporter molecule, in corn wet milling, in the fruit juice industry, in baking, in the textile
industry and in the detergent industry.

In accordance with yet a further aspect of the present invention, there is provided a
process for utilizing such enzymes for hydrolyzing guar gum (a galactomannan
polysaccharide) to remove non-reducing terminal mannose residues. Further,
polysaccharides such as galactomannan and the enzymes according to the invention that
degrade them have a variety of applications. Guar gum is commonly used as a
thickening agent in food and also is utilized in hydraulic fracturing in oil and gas
recovery. Consequently, mannanases are industrially relevant for the degradation and
modification of guar gums. Furthermore, a need exists for thermostable mannanases that
are active in the extreme conditions associated with drilling and well stimulation.

In accordance with yet a further aspect of the present invention, there are also provided
nucleic acid probes comprising nucleic acid molecules of sufficient length to specifically
hybridize to a nucleic acid sequence of the present invention.

In accordance with yet a further aspect of the present invention, there is provided a
process for utilizing such enzymes, or polynucleotides encoding such enzymes, for in
vitro purposes related to scientific research, for example, to generate probes for
identifying similar sequences which might encode similar enzymes from other organisms
by using certain regions, i.e., conserved sequence regions, of the nucleotide sequence.

These and other aspects of the present invention should be apparent to those skilled in
the art from the teachings herein.

The polynucleotides of this invention were originally recovered from genomic gene 
libraries derived from the following organisms: 

M11TL is a new species of Desulfurococcus isolated from Diamond Pool in Yellowstone
National Park. The organism grows optimally at 85-88 °C, pH 7.0 in a low salt medium
containing yeast extract, peptone, and gelatin as substrates with a N2/CO2 gas phase.

OC1/4V is from the genus Thermotoga. The organism was isolated from Yellowstone
National Park. It grows optimally at 75 °C in a low salt medium with cellulose as a
substrate and N2 in the gas phase.

Pyrococcus furiosus VC1 (and 7EG1) is from the genus Pyrococcus. VC1 was isolated
from Vulcano, Italy. It grows optimally at 100 °C in a high salt medium (marine)
containing elemental sulfur, yeast extract, peptone and starch as substrates and N2 in the gas
phase.

Staphylothermus marinus F1 is from the genus Staphylothermus. F1 was isolated from
Vulcano, Italy. It grows optimally at 85 °C, pH 6.5 in a high salt medium (marine)
containing elemental sulfur and yeast extract as substrates and N2 in the gas phase.

Thermococcus 9N-2 is from the genus Thermococcus. 9N-2 was isolated from diffuse
vent fluid in the East Pacific Rise. It is a strict anaerobe that grows optimally at 87 °C.

Thermotoga maritima MSB8 and MSB8 (Clone # 6GP2 and 6GB4) is from the genus
Thermotoga, and was isolated from Vulcano, Italy. MSB8 grows optimally at 85 °C, pH
6.5 in a high salt medium (marine) containing starch and yeast extract as substrates and
N2 in the gas phase.

Thermococcus alcaliphilus AEDII12RA is from the genus Thermococcus. AEDII12RA
grows optimally at 85 °C, pH 9.5 in a high salt medium (marine) containing polysulfides
and yeast extract as substrates and N2 in the gas phase.

Thermococcus chitonophagus GC74 is from the genus Thermococcus. GC74 grows
optimally at 85 °C, pH 6.0 in a high salt medium (marine) containing chitin, meat extract,
elemental sulfur and yeast extract as substrates and N2 in the gas phase.

AEPII 1a grows optimally at 85 °C at pH 6.5 in marine medium under anaerobic
conditions. It has many substrates.

Bankia gouldi is from the genus Bankia.

Accordingly, the polynucleotides and enzymes encoded thereby are identified by the
organism from which they were isolated, and are sometimes hereinafter referred to as
"M11TL" (Figure 1 and SEQ ID NOS:1 and 15), "OC1/4V-33B/G" (Figure 2 and SEQ
ID NOS:2 and 16), "F1-12G" (Figure 3 and SEQ ID NOS:3 and 17), "9N2-31B/G"
(Figure 4 and SEQ ID NOS:4 and 18), "MSB8" (Figure 5 and SEQ ID NOS:5 and 19),
"AEDII12RA-18B/G" (Figure 6 and SEQ ID NOS:6 and 20), "GC74-22G" (Figure 7 and
SEQ ID NOS:7 and 21), "VC1-7G1" (Figure 8 and SEQ ID NOS:8 and 22), "37GP1"
(Figure 9 and SEQ ID NOS:9 and 23), "6GC2" (Figure 10 and SEQ ID NOS:10 and
24), "6GP2" (Figure 11 and SEQ ID NOS:11 and 25), "AEPII 1a" (Figure 12 and SEQ
ID NOS:12 and 26), "OC1/4V" (Figure 13 and SEQ ID NOS:13 and 27), "6GP3"
(Figure 14 and SEQ ID NOS:28), "MSB8-6GP2" (Figure 15 and SEQ ID NOS:57 and
61), "MSB8-6GB4" (Figure 16 and SEQ ID NOS:58 and 62), "VC1-7EG1" (Figure 17 and
SEQ ID NOS:59 and 63), and "37GP4" (Figure 18 and SEQ ID NOS:60 and 64).

The polynucleotides and polypeptides of the present invention show identity at the 
nucleotide and protein level to known genes and proteins encoded thereby as shown in 
Table 1. 

Table 1

Clone | Gene/Protein with Closest Homology | Protein Identity | Nucleic Acid Identity

M11TL-29G | Sulfolobus solfataricus DSM1616/P1, β-galactosidase | 51% | 55%
OC1/4V-33B/G | Caldocellum saccharolyticum, β-glucosidase | 52% | 57%
Staphylothermus marinus F1-12G | Bacillus polymyxa, β-galactosidase | 36% | 48%
Thermococcus 9N2-31B/G | Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase | 51% | 50%
Thermotoga maritima MSB8-6G | Clostridium thermocellum bglB | 45% | 53%
Thermococcus AEDII12RA-18B/G | Bacillus polymyxa, β-galactosidase | 34% | 48%
Thermococcus chitonophagus GC74-22G | Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase | 46% | 54%
Pyrococcus furiosus VC1-7G1 | Sulfolobus solfataricus/MT-4 β-galactosidase | 46.4% | 52.5%
Thermotoga maritima α-galactosidase (6GC2) | Pediococcus pentosaceus α-galactosidase | 49% | 29%
Thermotoga maritima β-mannanase (6GP2) | Aspergillus aculeatus mannanase | 56% | 37%
AEPII 1a β-mannosidase (63GB1) | Sulfolobus solfataricus β-galactosidase | 78% | 56%
OC1/4V endoglucanase (33GP1) | Clostridium thermocellum endo-1,4-β-endoglucanase | 65% | 43%
Thermotoga maritima pullulanase (6GP3) | Caldocellum saccharolyticum α-dextran 6-glucanohydrolase | 72% | 53%
Bankia gouldi mix endoglucanase (37GP1) | None available | |


The polynucleotides and enzymes of the present invention show homology to each other
as shown in Table 2.

Table 2

Clone | Gene/Protein with Closest Homology | Protein Identity | Nucleic Acid Identity

Staphylothermus marinus F1-12G | Thermococcus AEDII12RA-18B/G, β-galactosidase, glucosidase | 55% | 57%
Thermococcus 9N2-31B/G | Thermococcus chitonophagus GC74-22G glucosidase* | 74% | 66%
Pyrococcus furiosus VC1-7G1 | Pyrococcus furiosus VC1-7B/G β-galactosidase | 46.4% | 54%


All the clones identified in Tables 1 and 2 encode polypeptides which have α-glycosidase
or β-glycosidase activity.

This invention, in addition to the isolated nucleic acid molecules encoding the enzymes
of the present invention, also provides substantially similar sequences. Isolated nucleic
acid sequences are substantially similar if: (i) they are capable of hybridizing, under
conditions hereinafter described, to the polynucleotides of SEQ ID NOS: 1-14 and 57-60;
or (ii) they encode DNA sequences which are degenerate to the polynucleotides of SEQ
ID NOS: 1-14 and 57-60. Degenerate DNA sequences encode the amino acid sequences
of SEQ ID NOS: 15-28 and 61-64, but have variations in the nucleotide coding
sequences. As used herein, substantially similar refers to sequences having similar
identity to the sequences of the instant invention. The nucleotide sequences that are
substantially the same can be identified by hybridization or by sequence comparison.
Enzyme sequences that are substantially the same can be identified by one or more of the
following: proteolytic digestion, gel electrophoresis and/or microsequencing.
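
By way of illustration only, the notion of degenerate DNA sequences used above can be sketched in a few lines of Python. The sketch assumes the standard genetic code; the function names and the two nine-base example sequences are hypothetical and are not taken from the sequence listing.

```python
# Illustrative sketch: two DNA sequences are "degenerate" to each other when
# they differ at the nucleotide level but encode the same amino acid sequence.
# Assumption: the standard genetic code, with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def are_degenerate(seq_a: str, seq_b: str) -> bool:
    """True if the sequences differ in bases but encode the same polypeptide."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


if __name__ == "__main__":
    # Hypothetical sequences: both encode Met-Ala-Lys with different codons.
    print(are_degenerate("ATGGCTAAA", "ATGGCCAAG"))
```

Sequences that differ only by such silent codon changes would fall under item (ii) above, whereas a change that alters an encoded amino acid would not.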

One means for isolating the nucleic acid molecules encoding the enzymes of the present
invention is to probe a gene library with a natural or artificially designed probe using art
recognized procedures (see, for example: Current Protocols in Molecular Biology,
Ausubel F.M. et al. (EDS.) Green Publishing Company Assoc. and John Wiley
Interscience, New York, 1989, 1992). It is appreciated by one skilled in the art that the
polynucleotides of SEQ ID NOS: 1-14 and 57-60 or fragments thereof (comprising at
least 12 contiguous nucleotides) are particularly useful probes. Other particularly useful
probes for this purpose are fragments hybridizable to the sequences of SEQ ID NOS: 1-14
and 57-60 (i.e., comprising at least 12 contiguous nucleotides).

With respect to nucleic acid sequences which hybridize to specific nucleic acid
sequences disclosed herein, hybridization may be carried out under conditions of reduced
stringency, medium stringency or even stringent conditions. As an example of
oligonucleotide hybridization, a polymer membrane containing immobilized denatured
nucleic acids is first prehybridized for 30 minutes at 45 °C in a solution consisting of 0.9
M NaCl, 50 mM NaH2PO4, pH 7.0, 5.0 mM Na2EDTA, 0.5% SDS, 10X Denhardt's, and
0.5 mg/ml polyriboadenylic acid. Approximately 2 × 10^7 cpm (specific activity 4-9 ×
10^8 cpm/µg) of 32P end-labeled oligonucleotide probe are then added to the solution.
After 12-16 hours of incubation, the membrane is washed for 30 minutes at room
temperature in 1X SET (150 mM NaCl, 20 mM Tris hydrochloride, pH 7.8, 1 mM
Na2EDTA) containing 0.5% SDS, followed by a 30 minute wash in fresh 1X SET at
Tm - 10 °C for the oligonucleotide probe. The membrane is then exposed to auto-radiographic
film for detection of hybridization signals.
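
The final wash described above is performed 10 °C below the melting temperature (Tm) of the oligonucleotide probe. The text does not state how Tm is to be estimated, so the following Python sketch is only one possibility: it assumes the commonly used Wallace rule for short oligonucleotides (2 °C per A or T, 4 °C per G or C), which is a rough approximation that does not account for salt concentration. The function names and the example probe are hypothetical.

```python
# Illustrative sketch: estimate the stringent wash temperature (Tm - 10 C)
# for a short oligonucleotide probe.
# Assumption: Tm is approximated by the Wallace rule,
#   Tm = 2 C * (A + T) + 4 C * (G + C),
# which is not specified in the text and is only a rough guide.

def wallace_tm(oligo: str) -> float:
    """Approximate melting temperature in degrees C of a short DNA oligo."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    if not oligo or at + gc != len(oligo):
        raise ValueError("oligo must be a non-empty string of A, C, G and T")
    return 2.0 * at + 4.0 * gc


def stringent_wash_temperature(oligo: str, offset_c: float = 10.0) -> float:
    """Wash temperature in degrees C, a fixed offset below the estimated Tm."""
    return wallace_tm(oligo) - offset_c


if __name__ == "__main__":
    probe = "ATGCATGCATGCATGC"  # hypothetical 16-mer probe
    print(wallace_tm(probe), stringent_wash_temperature(probe))
```

In practice an empirically determined or nearest-neighbor Tm may be preferable; the sketch only shows how the Tm - 10 °C wash temperature follows once a Tm estimate is available.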

Stringent conditions means hybridization will occur only if there is at least 90% identity,
preferably at least 95% identity and most preferably at least 97% identity between the
sequences. See J. Sambrook et al., Molecular Cloning, A Laboratory Manual, 2d Ed., Cold
Spring Harbor Laboratory (1989), which is hereby incorporated by reference in its entirety.
Further, it is understood that a fragment of a 100 bps sequence that is 95 bps in length
has 95% identity with the 100 bps sequence from which it is obtained.

As used herein, a first DNA (RNA) sequence is at least 70%, and preferably at least 80%,
identical to another DNA (RNA) sequence if there is at least 70%, and preferably at least
80% or 90%, identity between the bases of the first sequence and the
bases of the other sequence when properly aligned with each other, for example when
aligned by BLASTN.

"Identity", as the term is used herein, refers to a polynucleotide sequence which
comprises a percentage of the same bases as a reference polynucleotide (SEQ ID NOS: 1-14
and 57-60). For example, a polynucleotide which is at least 90% identical to a
reference polynucleotide has polynucleotide bases which are identical in 90% of the
bases which make up the reference polynucleotide and may have different bases in 10%
of the bases which comprise that polynucleotide sequence.
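
As a minimal sketch of the percentage described above, the following Python function counts identical bases over the positions of a reference sequence, assuming the two sequences have already been aligned (for example by BLASTN) and padded to equal length with a gap character. The function name and the example sequences are hypothetical.

```python
# Illustrative sketch: percent identity of a candidate sequence relative to a
# reference, computed over the bases that make up the reference.
# Assumption: the inputs are pre-aligned, equal-length strings in which "-"
# marks a gap; the alignment step itself is outside this sketch.

def percent_identity(reference: str, candidate: str, gap: str = "-") -> float:
    """Percentage of reference bases matched by the aligned candidate."""
    if len(reference) != len(candidate):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (r, c)
        for r, c in zip(reference.upper(), candidate.upper())
        if r != gap  # only positions occupied by a reference base count
    ]
    if not pairs:
        raise ValueError("reference contains no bases")
    matches = sum(1 for r, c in pairs if r == c)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # One mismatch in ten reference bases.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
    # A 95-base fragment aligned against its 100-base parent.
    parent = "ACGT" * 25
    fragment = parent[:95] + "-" * 5
    print(percent_identity(parent, fragment))
```

The second call mirrors the fragment example given in the text: a 95-base fragment of a 100-base sequence scores 95% against that sequence.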

The present invention relates to polynucleotides which differ from the reference
polynucleotide such that the changes are silent changes, for example changes that do not
alter the amino acid sequence encoded by the polynucleotide. The present invention also
relates to nucleotide changes which result in amino acid substitutions, additions,
deletions, fusions and truncations in the polypeptide encoded by the reference
polynucleotide. In a preferred aspect of the invention these polypeptides retain the same
biological action as the polypeptide encoded by the reference polynucleotide.

It is also appreciated that such probes can be and are preferably labeled with an
analytically detectable reagent to facilitate identification of the probe. Useful reagents
include but are not limited to radioactivity, fluorescent dyes or enzymes capable of
catalyzing the formation of a detectable product. The probes are thus useful to isolate
complementary copies of DNA from other sources or to screen such sources for related
sequences.

The polynucleotides of this invention were recovered from genomic gene libraries from
the organisms listed in Table 1. For example, gene libraries can be generated in the
Lambda ZAP II cloning vector (Stratagene Cloning Systems). Mass excisions can be
performed on these libraries to generate libraries in the pBluescript phagemid. Libraries
are thus generated and excisions performed according to the protocols/methods
hereinafter described.

The excision libraries are introduced into the E. coli strain BW14893 F'kan1A.
Expression clones are then identified using a high temperature filter assay. Expression
clones encoding several glucanases and several other glycosidases are identified and
repurified. The polynucleotides, and enzymes encoded thereby, of the present invention,
yield the activities as described above.

The coding sequences for the enzymes of the present invention were identified by
screening the genomic DNAs prepared for the clones having glucosidase or galactosidase
activity.

An example of such an assay is a high temperature filter assay wherein expression clones
were identified using buffer Z (see recipe below)
containing 1 mg/ml of the substrate 5-bromo-4-chloro-3-indolyl-β-D-glucopyranoside
(XGLU) (Diagnostic Chemicals Limited or Sigma) after introducing an excision library
into the E. coli strain BW14893 F'kan1A. Expression clones encoding XGLUases were
identified and repurified from M11TL, OC1/4V, Pyrococcus furiosus VC1,
Staphylothermus marinus F1, Thermococcus 9N-2, Thermotoga maritima MSB8,
Thermococcus alcaliphilus AEDII12RA, and Thermococcus chitonophagus GC74.

Z-buffer: (referenced in Miller, J.H. (1992) A Short Course in Bacterial Genetics, p.
445.)

per liter:

Na2HPO4-7H2O         16.1 g
NaH2PO4-7H2O         5.5 g
KCl                  0.75 g
MgSO4-7H2O           0.246 g
β-mercaptoethanol    2.7 ml
Adjust pH to 7.0
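
For convenience, the per-liter recipe above can be scaled to other batch volumes. The following Python sketch shows one way to do so; the dictionary keys, the function name and the rounding to four decimal places are hypothetical choices made for illustration, and the amounts are simply those listed above.

```python
# Illustrative sketch: scale the per-liter Z-buffer recipe to another volume.
# The component amounts are copied from the recipe above; the names and the
# rounding precision are assumptions made for this example.
Z_BUFFER_PER_LITER = {
    "Na2HPO4-7H2O (g)": 16.1,
    "NaH2PO4-7H2O (g)": 5.5,
    "KCl (g)": 0.75,
    "MgSO4-7H2O (g)": 0.246,
    "beta-mercaptoethanol (ml)": 2.7,
}


def scale_z_buffer(volume_ml: float) -> dict:
    """Return component amounts for the requested final volume in ml."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    factor = volume_ml / 1000.0
    return {name: round(amount * factor, 4)
            for name, amount in Z_BUFFER_PER_LITER.items()}


if __name__ == "__main__":
    for name, amount in scale_z_buffer(250).items():
        print(f"{name}: {amount}")
    print("Adjust pH to 7.0")
```

The pH adjustment to 7.0 is independent of the batch size and is still carried out on the final solution.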
High Temperature Filter Assay

(1) The F factor F'kan (from E. coli strain CSH118)(1) was introduced into the
pho- pnh- lac- strain BW14893(2). The filamentous phage
library was plated on the resulting strain, BW14893 F'kan. (Miller, J.H.
(1992) A Short Course in Bacterial Genetics; Lee, K.S., Metcalf, et al.
(1992) Evidence for two phosphonate degradative pathways in Enterobacter
aerogenes, J. Bacteriol., 174:2501-2510.)

(2) After growth on 100 mm LB plates containing 100 µg/ml ampicillin, 80
µg/ml methicillin and 1 mM IPTG, colony lifts were performed using
Millipore HATF membrane filters.

(3) The colonies transferred to the filters were lysed with chloroform vapor in
150 mm glass petri dishes.

(4) The filters were transferred to 100 mm glass petri dishes containing a piece
of Whatman 3MM filter paper saturated with buffer.

(a) When testing for galactosidase activity (XGALase), 3MM paper
was saturated with Z buffer containing 1 mg/ml XGAL (ChemBridge
Corporation). After transferring the filter bearing lysed colonies to the
glass petri dish, the dish was placed in an oven at 80-85 °C.

(b) When testing for glucosidase activity (XGLUase), 3MM paper was
saturated with Z buffer containing 1 mg/ml XGLU. After transferring the
filter bearing lysed colonies to the glass petri dish, the dish was placed in an
oven at 80-85 °C.

(5) "Positives" were observed as blue spots on the filter membranes. The
following filter rescue technique was used to retrieve plasmid from a lysed positive
colony. A pasteur pipette (or glass capillary tube) was used to core blue spots on
the filter membrane. The small filter disk was placed in an Eppendorf tube
containing 20 µl water. The Eppendorf tube was incubated at 75 °C for 5 minutes,
followed by vortexing to elute plasmid DNA off the filter. This DNA was
transformed into electrocompetent E. coli DH10B cells for Thermotoga
maritima MSB8-6G, Staphylothermus marinus F1-12G, Thermococcus
AEDII12RA-18B/G, Thermococcus chitonophagus GC74-22G, M11TL and
OC1/4V. Electrocompetent BW14893 F'kan1A E. coli were used for
Thermococcus 9N2-31B/G and Pyrococcus furiosus VC1-7G1. The
filter-lift assay was repeated on transformation plates to identify "positives". The
transformation plates were returned to a 37 °C incubator after the filter lift to regenerate colonies.
3 ml of LB liquid containing 100 µg/ml ampicillin was inoculated with repurified
positives and incubated at 37 °C overnight. Plasmid DNA was isolated from these
cultures and the plasmid insert was sequenced. In some instances where the plates
used for the initial colony lifts contained non-confluent colonies, a specific
colony corresponding to a blue spot on the filter could be identified on a
regenerated plate and repurified directly, instead of using the filter rescue
technique.



Another example of such an assay is a variation of the high temperature filter assay
wherein colony-laden filters are heat-killed at different temperatures (for example, 105°C
for 20 minutes) to monitor thermostability. The 3 MM paper is saturated with different
buffers (i.e., 100 mM NaCl, 5 mM MgCl2, 100 mM Tris-Cl (pH 9.5)) to determine
enzyme activity under different buffer conditions.

A β-glucosidase assay may also be employed, wherein GlcβpNp is used as an artificial
substrate (aryl-β-glucosidase). The increase in absorbance at 405 nm as a result of
p-nitrophenol (pNp) liberation is followed on a Hitachi U-1100 spectrophotometer
equipped with a thermostatted cuvette holder. The assays may be performed at 80°C or
90°C in closed 1-ml quartz cuvettes. A standard reaction mixture contains 150 mM
trisodium substrate, pH 5.0 (at 80°C), and 0.95 mM pNp derivative (ε pNp = 0.561 mM⁻¹
cm⁻¹). The reaction mixture is allowed to reach the desired temperature, after which the
reaction is started by injecting an appropriate amount of enzyme (1.06 ml final volume).

1 U of β-glucosidase activity is defined as that amount required to catalyze the formation
of 1.0 µmol pNp/min. D-cellobiose may also be used as a substrate.
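
As a worked illustration of the unit definition above, the rate of pNp formation can be estimated from the slope of the 405 nm absorbance trace using the Beer-Lambert law. The Python sketch below is illustrative only: the function name, the 1 cm optical path length and the example slope are assumptions that do not appear in the assay description, while the extinction coefficient (0.561 per mM per cm) and the 1.06 ml final volume are taken from the text.

```python
def pnp_units(delta_a405_per_min, epsilon_mM_cm=0.561, path_cm=1.0, volume_ml=1.06):
    """Return enzyme units (µmol pNp formed per minute) present in the cuvette.

    delta_a405_per_min: slope of the absorbance at 405 nm, per minute.
    path_cm: optical path length; 1 cm is an assumption, the text does not state it.
    """
    if epsilon_mM_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    # Beer-Lambert law: concentration change (mM/min) = dA / (epsilon * path)
    rate_mM_per_min = delta_a405_per_min / (epsilon_mM_cm * path_cm)
    # 1 mM equals 1 µmol/ml, so (mM/min) * ml gives µmol/min, i.e. units
    return rate_mM_per_min * volume_ml


if __name__ == "__main__":
    # Hypothetical slope of 0.10 absorbance units per minute
    print(f"{pnp_units(0.10):.3f} U in the cuvette")
```

Dividing the result by the amount of enzyme protein added would give a specific activity; that step is omitted because the text does not specify protein amounts.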

An ONPG assay for β-galactosidase activity is described by Miller, J.H. (1992) A Short
Course in Bacterial Genetics and Miller, J.H. (1992) Experiments in Molecular Genetics,
the contents of which are hereby incorporated by reference in their entirety.

A quantitative fluorometric assay for β-galactosidase specific activity is described by
Youngman P., (1987) Plasmid Vectors for Recovering and Exploiting Tn917
Transpositions in Bacillus and other Gram-Positive Bacteria. In Plasmids: A Practical
Approach (ed. K. Hardy) pp 79-103. IRL Press, Oxford. A description of the procedure
can be found in Miller (1992) p. 75-77, the contents of which are incorporated by
reference herein in their entirety.

The polynucleotides of the present invention may be in the form of DNA, which DNA
includes cDNA, genomic DNA, and synthetic DNA. The DNA may be double-stranded
or single-stranded, and if single-stranded may be the coding strand or non-coding
(anti-sense) strand. The coding sequences which encode the mature enzymes may be
identical to the coding sequences shown in Figures 1-18 (SEQ ID NOS: 1-14 and 57-60)
or may be a different coding sequence which, as a result of the
redundancy or degeneracy of the genetic code, encodes the same mature enzymes as the
DNA of Figures 1-18 (SEQ ID NOS: 1-14 and 57-60).
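
The degeneracy described above can be illustrated with a short Python sketch in which two different coding sequences translate to the same peptide. The standard genetic code is assumed, and the two 12-base sequences and the function name are hypothetical examples chosen for illustration; they are not sequences of the Figures.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first, second, third base)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(coding_dna):
    """Translate a coding-strand DNA sequence, read in frame from its first base."""
    seq = coding_dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


if __name__ == "__main__":
    original = "ATGGTGAATGCT"    # hypothetical coding sequence
    degenerate = "ATGGTCAACGCA"  # differs in the third base of three codons
    print(translate(original), translate(degenerate))
```

Both sequences encode the same four residues even though they differ at three nucleotide positions, which is the sense in which a "different coding sequence" can encode the same mature enzyme.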

The polynucleotide which encodes for the mature enzyme of Figures 1-18 (SEQ ID NOS: 
15-28 and 61-64) may include, but is not limited to: only the coding sequence for the
mature enzyme; the coding sequence for the mature enzyme and additional coding
sequence such as a leader sequence or a proprotein sequence; the coding sequence for the
mature enzyme (and optionally additional coding sequence) and non-coding sequence,
such as introns or non-coding sequence 5' and/or 3' of the coding sequence for the mature
enzyme.

Thus, the term "polynucleotide encoding an enzyme (protein)" encompasses a 
polynucleotide which includes only coding sequence for the enzyme as well as a 
polynucleotide which includes additional coding and/or non-coding sequence. 

The present invention further relates to variants of the hereinabove described 
polynucleotides which encode for fragments, analogs and derivatives of the enzymes
having the deduced amino acid sequences of Figures 1-18 (SEQ ID NOS: 15-28 and
61-64). The variant of the polynucleotide may be a naturally occurring allelic variant of the
polynucleotide or a non-naturally occurring variant of the polynucleotide. 

Thus, the present invention includes polynucleotides encoding the same mature enzymes 
as shown in Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) as well as variants of such
polynucleotides which variants encode for a fragment, derivative or analog of the 
enzymes of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64). Such nucleotide variants 
include deletion variants, substitution variants and addition or insertion variants. 

As hereinabove indicated, the polynucleotides may have a coding sequence which is a 
naturally occurring allelic variant of the coding sequences shown in Figures 1-18 (SEQ
ID NOS: 1-14 and 57-60). As known in the art, an allelic variant is an alternate form of
a polynucleotide sequence which may have a substitution, deletion or addition of one or 
more nucleotides, which does not substantially alter the function of the encoded enzyme. 

Fragments of the full length gene of the present invention may be used as a hybridization 
probe for a cDNA or a genomic library to isolate the full length DNA and to isolate other
DNAs which have a high sequence similarity to the gene or similar biological activity.
Probes of this type preferably have at least 10, preferably at least 15, and even more
preferably at least 30 bases and may contain, for example, at least 50 or more bases. The
probe may also be used to identify a DNA clone corresponding to a full length transcript
and a genomic clone or clones that contain the complete gene including regulatory and
promoter regions, exons, and introns. An example of a screen comprises isolating the
coding region of the gene by using the known DNA sequence to synthesize an
oligonucleotide probe. Labeled oligonucleotides having a sequence complementary to
that of the gene of the present invention are used to screen a library of genomic DNA to
determine which members of the library the probe hybridizes to.
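
The idea of a probe "complementary to that of the gene" can be sketched computationally. The Python example below is a simplified in silico analogy, not the laboratory screen itself: it assumes exact matching (real hybridization tolerates some mismatch depending on stringency), and the library contents, clone names and function names are hypothetical.

```python
# Translation table mapping each base to its Watson-Crick partner
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_hits(probe, library):
    """Return names of library members with an exact site the probe could pair with.

    Library DNA is treated as double-stranded, so both the site complementary
    to the probe and the probe sequence itself (other strand) count as hits.
    """
    probe = probe.upper()
    target = reverse_complement(probe)
    return [name for name, seq in library.items()
            if target in seq.upper() or probe in seq.upper()]


if __name__ == "__main__":
    # Hypothetical clones; clone_a carries the site ATGGTGAATGCT
    library = {"clone_a": "TTTATGGTGAATGCTAAA", "clone_b": "CCCCCCCCCCCC"}
    probe = reverse_complement("ATGGTGAATGCT")
    print(probe, probe_hits(probe, library))
```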

The present invention further relates to polynucleotides which hybridize to the 
hereinabove-described sequences if there is at least 70%, preferably at least 90%, and 
more preferably at least 95% identity between the sequences. The present invention 
particularly relates to polynucleotides which hybridize under stringent conditions to the
hereinabove-described polynucleotides. As herein used, the term "stringent conditions"
means hybridization will occur only if there is at least 95% and preferably at least 97%
identity between the sequences. The polynucleotides which hybridize to the hereinabove
described polynucleotides in a preferred embodiment encode enzymes which retain
substantially the same biological function or activity as the mature enzyme encoded by
the DNA of Figures 1-18 (SEQ ID NOS: 1-14 and 57-60).

Alternatively, the polynucleotide may have at least 15 bases, preferably at least 30 bases, 
and more preferably at least 50 bases which hybridize to any part of a polynucleotide of
the present invention and which has an identity thereto, as hereinabove described, and
which may or may not retain activity. For example, such polynucleotides may be 
employed as probes for the polynucleotides of SEQ ID NOS: 1-14 and 57-60, for 
example, for recovery of the polynucleotide or as a diagnostic probe or as a PCR primer. 

Thus, the present invention is directed to polynucleotides having at least a 70% identity,
preferably at least 90% identity and more preferably at least a 95% identity to a
polynucleotide which encodes the enzymes of SEQ ID NOS: 15-28 and 61-64 as well as
fragments thereof, which fragments have at least 15 bases, preferably at least 30 bases
and most preferably at least 50 bases, which fragments are at least 90% identical,
preferably at least 95% identical and most preferably at least 97% identical under
stringent conditions to any portion of a polynucleotide of the present invention. 
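
The identity thresholds recited above (70%, 90% and 95%) can be made concrete with a small calculation. The Python sketch below assumes the two sequences have already been aligned to equal length, counts gap columns as mismatches, and divides by the alignment length; these conventions, the function names and the example sequences are assumptions, since the text does not prescribe how identity is computed.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same base in both sequences.

    Inputs must be pre-aligned strings of equal length; '-' marks a gap and
    never counts as a match.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal aligned length")
    matches = sum(1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
                  if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def identity_tier(pct, tiers=(95.0, 90.0, 70.0)):
    """Return the highest recited threshold met by pct, or None if below all."""
    for tier in tiers:
        if pct >= tier:
            return tier
    return None


if __name__ == "__main__":
    reference = "ATGGTGAATGCTATGATTGT"  # hypothetical 20-base fragment
    variant = "ATGGTGAATGCTATGATTGA"    # one substitution at the last base
    pct = percent_identity(reference, variant)
    print(pct, identity_tier(pct))
```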

The present invention further relates to enzymes which have the deduced amino acid 
sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) as well as fragments, analogs 
and derivatives of such enzymes.

The terms "fragment," "derivative" and "analog" when referring to the enzymes of
Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) mean enzymes which retain essentially
the same biological function or activity as such enzymes. Thus, an analog includes a 
proprotein which can be activated by cleavage of the proprotein portion to produce an 
active mature enzyme. 

The enzymes of the present invention may be a recombinant enzyme, a natural enzyme
or a synthetic enzyme, preferably a recombinant enzyme. 

The fragment, derivative or analog of the enzymes of Figures 1-18 (SEQ ID NOS: 15-28 
and 61-64) may be (i) one in which one or more of the amino acid residues are 
substituted with a conserved or non-conserved amino acid residue (preferably a 
conserved amino acid residue) and such substituted amino acid residue may or may not
be one encoded by the genetic code, or (ii) one in which one or more of the amino acid
residues includes a substituent group, or (iii) one in which the mature enzyme is fused
with another compound, such as a compound to increase the half-life of the enzyme (for
example, polyethylene glycol), or (iv) one in which the additional amino acids are fused
to the mature enzyme, such as a leader or secretory sequence or a sequence which is
employed for purification of the mature enzyme or a proprotein sequence. Such 
fragments, derivatives and analogs are deemed to be within the scope of those skilled in 
the art from the teachings herein. 

The enzymes and polynucleotides of the present invention are preferably provided in an 
isolated form, and preferably are purified to homogeneity.

The term "isolated" means that the material is removed from its original environment 
(e.g., the natural environment if it is naturally occurring). For example, a
naturally-occurring polynucleotide or enzyme present in a living animal is not isolated, but the
same polynucleotide or enzyme, separated from some or all of the coexisting materials
in the natural system, is isolated. Such polynucleotides could be part of a vector and/or
such polynucleotides or enzymes could be part of a composition, and still be isolated in 
that such vector or composition is not part of its natural environment. 

The enzymes of the present invention include the enzymes of SEQ ID NOS: 15-28 and
61-64 (in particular the mature enzyme) as well as enzymes which have at least 70%
similarity (preferably at least 70% identity) to the enzymes of SEQ ID NOS: 15-28 and
61-64 and more preferably at least 90% similarity (more preferably at least 90% identity)
to the enzymes of SEQ ID NOS: 15-28 and 61-64 and still more preferably at least 95%
similarity (still more preferably at least 95% identity) to the enzymes of SEQ ID NOS:
15-28 and 61-64 and also include portions of such enzymes with such portion of the
enzyme generally containing at least 30 amino acids and more preferably at least 50
amino acids.

As known in the art "similarity" between two enzymes is determined by comparing the
amino acid sequence and its conserved amino acid substitutes of one enzyme to the 
sequence of a second enzyme. 

A variant, i.e. a "fragment", "analog" or "derivative" polypeptide, and reference 
polypeptide may differ in amino acid sequence by one or more substitutions, additions,
deletions, fusions and truncations, which may be present in any combination. 

Among preferred variants are those that vary from a reference by conservative amino 
acid substitutions. Such substitutions are those that substitute a given amino acid in a 
polypeptide by another amino acid of like characteristics. Typically seen as conservative 
substitutions are the replacements, one for another, among the aliphatic amino acids Ala,
Val, Leu and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the
acidic residues Asp and Glu; substitution between the amide residues Asn and Gln;
exchange of the basic residues Lys and Arg; and replacements among the aromatic
residues Phe and Tyr.
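
The conservative substitution groups listed above lend themselves to a simple lookup. The Python sketch below encodes only the six groups named in the text, using one-letter amino acid codes; the function names, the requirement of equal-length ungapped sequences and the example peptides are assumptions made for illustration.

```python
# Groups named in the text, written with one-letter amino acid codes
CONSERVATIVE_GROUPS = [
    set("AVLI"),  # aliphatic: Ala, Val, Leu, Ile
    set("ST"),    # hydroxyl: Ser, Thr
    set("DE"),    # acidic: Asp, Glu
    set("NQ"),    # amide: Asn, Gln
    set("KR"),    # basic: Lys, Arg
    set("FY"),    # aromatic: Phe, Tyr
]


def is_conservative(ref, var):
    """True if ref and var are different residues belonging to the same group."""
    return ref != var and any(ref in g and var in g for g in CONSERVATIVE_GROUPS)


def classify_substitutions(reference, variant):
    """List (position, ref, var, conservative?) for every differing residue.

    Assumes equal-length, ungapped sequences; positions are 1-based.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [(i, r, v, is_conservative(r, v))
            for i, (r, v) in enumerate(zip(reference, variant), start=1)
            if r != v]


if __name__ == "__main__":
    # Hypothetical reference and variant peptides
    for change in classify_substitutions("MKDLS", "MRDIG"):
        print(change)
```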

Most highly preferred are variants which retain the same biological function and activity
as the reference polypeptide from which they vary.

Fragments or portions of the enzymes of the present invention may be employed for 
producing the corresponding full-length enzyme by peptide synthesis; therefore, the 
fragments may be employed as intermediates for producing the full-length enzymes. 
Fragments or portions of the polynucleotides of the present invention may be used to
synthesize full-length polynucleotides of the present invention. 

The present invention also relates to vectors which include polynucleotides of the present 
invention, host cells which are genetically engineered with vectors of the invention and 
the production of enzymes of the invention by recombinant techniques.

Host cells are genetically engineered (transduced or transformed or transfected) with the
vectors of this invention which may be, for example, a cloning vector or an expression
vector. The vector may be, for example, in the form of a plasmid, a viral particle, a
phage, etc. The engineered host cells can be cultured in conventional nutrient media
modified as appropriate for activating promoters, selecting transformants or amplifying
the genes of the present invention. The culture conditions, such as temperature, pH and 
the like, are those previously used with the host cell selected for expression, and will be 
apparent to the ordinarily skilled artisan. 

The polynucleotides of the present invention may be employed for producing enzymes 
by recombinant techniques. Thus, for example, the polynucleotide may be included in
any one of a variety of expression vectors for expressing an enzyme. Such vectors
include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., derivatives
of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors derived
from combinations of plasmids and phage DNA, viral DNA such as vaccinia, adenovirus,
fowl pox virus, and pseudorabies. However, any other vector may be used as long as it
is replicable and viable in the host. 

The appropriate DNA sequence may be inserted into the vector by a variety of 
procedures. In general, the DNA sequence is inserted into an appropriate restriction 
endonuclease site(s) by procedures known in the art. Such procedures and others are 
deemed to be within the scope of those skilled in the art.

The DNA sequence in the expression vector is operatively linked to an appropriate 
expression control sequence(s) (promoter) to direct mRNA synthesis. As representative 
examples of such promoters, there may be mentioned: LTR or SV40 promoter, the E.
coli lac or trp, the phage lambda PL promoter and other promoters known to control
expression of genes in prokaryotic or eukaryotic cells or their viruses. The expression
vector also contains a ribosome binding site for translation initiation and a transcription
terminator. The vector may also include appropriate sequences for amplifying
expression.

In addition, the expression vectors preferably contain one or more selectable marker 
genes to provide a phenotypic trait for selection of transformed host cells such as
dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or such as
tetracycline or ampicillin resistance in E. coli.

The vector containing the appropriate DNA sequence as hereinabove described, as well 
as an appropriate promoter or control sequence, may be employed to transform an 
appropriate host to permit the host to express the protein.

As representative examples of appropriate hosts, there may be mentioned: bacterial cells, 
such as E. coli, Streptomyces, Bacillus subtilis; fungal cells, such as yeast; insect cells
such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, COS or Bowes
melanoma; adenoviruses; plant cells, etc. The selection of an appropriate host is deemed
to be within the scope of those skilled in the art from the teachings herein.

More particularly, the present invention also includes recombinant constructs comprising 
one or more of the sequences as broadly described above. The constructs comprise a 
vector, such as a plasmid or viral vector, into which a sequence of the invention has been 
inserted, in a forward or reverse orientation. In a preferred aspect of this embodiment,
the construct further comprises regulatory sequences, including, for example, a promoter,
operably linked to the sequence. Large numbers of suitable vectors and promoters are
known to those of skill in the art, and are commercially available. The following vectors
are provided by way of example; Bacterial: pQE70, pQE60, pQE-9 (Qiagen), pD10,
psiX174, pBluescript II KS, pNH8A, pNH16a, pNH18A, pNH46A (Stratagene); ptrc99a,
pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia); Eukaryotic: pSV2CAT, pOG44,
pXT1, pSG (Stratagene) pSVK3, pBPV, pMSG, pSVL (Pharmacia). However, any other
plasmid or vector may be used as long as it is replicable and viable in the host.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol 
transferase) vectors or other vectors with selectable markers. Two appropriate vectors 
are pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3,
T7, gpt, lambda PR, PL and trp. Eukaryotic promoters include CMV immediate early,
HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse
metallothionein-I. Selection of the appropriate vector and promoter is well within the
level of ordinary skill in the art. 

In a further embodiment, the present invention relates to host cells containing the
above-described constructs. The host cell can be a higher eukaryotic cell, such as a mammalian
cell, or a lower eukaryotic cell, such as a yeast cell, or the host cell can be a prokaryotic
cell, such as a bacterial cell. Introduction of the construct into the host cell can be
effected by calcium phosphate transfection, DEAE-Dextran mediated transfection, or
electroporation (Davis, L., Dibner, M., Battey, I., Basic Methods in Molecular Biology,
(1986)). 

The constructs in host cells can be used in a conventional manner to produce the gene 
product encoded by the recombinant sequence. Alternatively, the enzymes of the 
invention can be synthetically produced by conventional peptide synthesizers. 

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells
under the control of appropriate promoters. Cell-free translation systems can also be
employed to produce such proteins using RNAs derived from the DNA constructs of the
present invention. Appropriate cloning and expression vectors for use with prokaryotic
and eukaryotic hosts are described by Sambrook, et al., Molecular Cloning: A Laboratory
Manual, Second Edition, Cold Spring Harbor, N.Y., (1989), the disclosure of which is
hereby incorporated by reference.

Transcription of the DNA encoding the enzymes of the present invention by higher
eukaryotes is increased by inserting an enhancer sequence into the vector. Enhancers are
cis-acting elements of DNA, usually about from 10 to 300 bp that act on a promoter to
increase its transcription. Examples include the SV40 enhancer on the late side of the
replication origin bp 100 to 270, a cytomegalovirus early promoter enhancer, the
polyoma enhancer on the late side of the replication origin, and adenovirus enhancers. 

Generally, recombinant expression vectors will include origins of replication and 
selectable markers permitting transformation of the host cell, e.g., the ampicillin 
resistance gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a
highly-expressed gene to direct transcription of a downstream structural sequence. Such
promoters can be derived from operons encoding glycolytic enzymes such as
3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock proteins,
among others. The heterologous structural sequence is assembled in appropriate phase
with translation initiation and termination sequences, and preferably, a leader sequence
capable of directing secretion of translated enzyme. Optionally, the heterologous
sequence can encode a fusion enzyme including an N-terminal identification peptide 
imparting desired characteristics, e.g., stabilization or simplified purification of 
expressed recombinant product. 

Useful expression vectors for bacterial use are constructed by inserting a structural DNA 
sequence encoding a desired protein together with suitable translation initiation and
termination signals in operable reading phase with a functional promoter. The vector will
comprise one or more phenotypic selectable markers and an origin of replication to
ensure maintenance of the vector and to, if desirable, provide amplification within the
host. Suitable prokaryotic hosts for transformation include E. coli, Bacillus subtilis,
Salmonella typhimurium and various species within the genera Pseudomonas,
Streptomyces, and Staphylococcus, although others may also be employed as a matter
of choice.

As a representative but nonlimiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from
commercially available plasmids comprising genetic elements of the well known cloning
vector pBR322 (ATCC 37017). Such commercial vectors include, for example,
pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotec,
Madison, WI, USA). These pBR322 "backbone" sections are combined with an 
appropriate promoter and the structural sequence to be expressed. 

Following transformation of a suitable host strain and growth of the host strain to an 
appropriate cell density, the selected promoter is induced by appropriate means (e.g., 
temperature shift or chemical induction) and cells are cultured for an additional period.

Cells are typically harvested by centrifugation, disrupted by physical or chemical means, 
and the resulting crude extract retained for further purification. 

Microbial cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell 
lysing agents; such methods are well known to those skilled in the art.

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey 
kidney fibroblasts, described by Gluzman, Cell, 23:175 (1981), and other cell lines 
capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and
BHK cell lines. Mammalian expression vectors will comprise an origin of replication,
a suitable promoter and enhancer, and also any necessary ribosome binding sites,
polyadenylation site, splice donor and acceptor sites, transcriptional termination
sequences, and 5' flanking nontranscribed sequences. DNA sequences derived from the
SV40 splice, and polyadenylation sites may be used to provide the required
nontranscribed genetic elements.

The enzyme can be recovered and purified from recombinant cell cultures by methods
including ammonium sulfate or ethanol precipitation, acid extraction, anion or cation
exchange chromatography, phosphocellulose chromatography, hydrophobic interaction
chromatography, affinity chromatography, hydroxylapatite chromatography and lectin
chromatography. Protein refolding steps can be used, as necessary, in completing
configuration of the mature protein. Finally, high performance liquid chromatography 
(HPLC) can be employed for final purification steps. 

The enzymes of the present invention may be a naturally purified product, or a product 
of chemical synthetic procedures, or produced by recombinant techniques from a 
prokaryotic or eukaryotic host (for example, by bacterial, yeast, higher plant, insect and
mammalian cells in culture). Depending upon the host employed in a recombinant 
production procedure, the enzymes of the present invention may be glycosylated or may 
be non-glycosylated. Enzymes of the invention may or may not also include an initial 
methionine amino acid residue. 

β-Galactosidase hydrolyzes lactose to galactose and glucose. Accordingly, the OC1/4V,
9N2-31B/G, AEDII12RA-18B/G and F1-12G enzymes may be employed in the food
processing industry for the production of low lactose content milk and for the production
of galactose or glucose from lactose contained in whey obtained in a large amount as a
by-product in the production of cheese. Generally, it is desired that enzymes used in
food processing, such as the aforementioned β-galactosidases, be stable at elevated
temperatures to help prevent microbial contamination.

These enzymes may also be employed in the pharmaceutical industry. The enzymes are 
used to treat intolerance to lactose. In this case, a thermostable enzyme is desired, as 
well. Thermostable β-galactosidases also have uses in diagnostic applications, where
they are employed as reporter molecules.

Glucosidases act on soluble cellooligosaccharides from the non-reducing end to give
glucose as the sole product. Glucanases (endo- and exo-) act in the depolymerization of
cellulose, generating more non-reducing ends (endo-glucanases, for instance, act on
internal linkages yielding cellobiose, glucose and cellooligosaccharides as products).
β-Glucosidases are used in applications where glucose is the desired product. Accordingly,
M11TL, F1-12G, GC74-22G, MSB8-6G, OC1/4V, VC1-7G1, 9N2-31B/G and
AEDII12RA-18B/G may be employed in a wide variety of industrial applications,
including in corn wet milling for the separation of starch and gluten, in the fruit industry
for clarification and equipment maintenance, in baking for viscosity reduction, in the
textile industry for the processing of blue jeans, and in the detergent industry as an
additive. For these and other applications, thermostable enzymes are desirable. 

Antibodies generated against the enzymes corresponding to a sequence of the present 
invention can be obtained by direct injection of the enzymes into an animal or by 
administering the enzymes to an animal, preferably a nonhuman. The antibody so 
obtained will then bind the enzymes themselves. In this manner, even a sequence encoding
only a fragment of the enzymes can be used to generate antibodies binding the whole 
native enzymes. Such antibodies can then be used to isolate the enzyme from cells 
expressing that enzyme. 

For preparation of monoclonal antibodies, any technique which provides antibodies 
produced by continuous cell line cultures can be used. Examples include the hybridoma
technique (Kohler and Milstein, 1975, Nature, 256:495-497), the trioma technique, the
human B-cell hybridoma technique (Kozbor et al., 1983, Immunology Today 4:72), and
the EBV-hybridoma technique to produce human monoclonal antibodies (Cole, et al.,
1985, in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

Techniques described for the production of single chain antibodies (U.S. Patent
4,946,778) can be adapted to produce single chain antibodies to immunogenic enzyme 
products of this invention. Also, transgenic mice may be used to express humanized 
antibodies to immunogenic enzyme products of this invention. 

Antibodies generated against the enzyme of the present invention may be used in
screening for similar enzymes from other organisms and samples. Such screening
techniques are known in the art; for example, one such screening assay is described in
"Methods for Measuring Cellulase Activities", Methods in Enzymology, Vol 160, pp.
87-116, which is hereby incorporated by reference in its entirety.

The present invention will be further described with reference to the following examples;
however, it is to be understood that the present invention is not limited to such examples. 
All parts or amounts, unless otherwise specified, are by weight. 

In order to facilitate understanding of the following examples certain frequently 
occurring methods and/or terms will be described. 

"Plasmids" are designated by a lower case p preceded and/or followed by capital letters
and/or numbers. The starting plasmids herein are either commercially available, publicly 
available on an unrestricted basis, or can be constructed from available plasmids in 
accord with published procedures. In addition, equivalent plasmids to those described 
are known in the art and will be apparent to the ordinarily skilled artisan. 

"Digestion" of DNA refers to catalytic cleavage of the DNA with a restriction enzyme
that acts only at certain sequences in the DNA. The various restriction enzymes used
herein are commercially available and their reaction conditions, cofactors and other
requirements were used as would be known to the ordinarily skilled artisan. For
analytical purposes, typically 1 µg of plasmid or DNA fragment is used with about 2
units of enzyme in about 20 µl of buffer solution. For the purpose of isolating DNA
fragments for plasmid construction, typically 5 to 50 µg of DNA are digested with 20 to
250 units of enzyme in a larger volume. Appropriate buffers and substrate amounts for
particular restriction enzymes are specified by the manufacturer. Incubation times of
about 1 hour at 37°C are ordinarily used, but may vary in accordance with the supplier's
instructions. After digestion the reaction is electrophoresed directly on a polyacrylamide
gel to isolate the desired fragment. 

Size separation of the cleaved fragments is performed using 8 percent polyacrylamide 
gel described by Goeddel, D. et al., Nucleic Acids Res., 8:4057 (1980).

"Oligonucleotides" refers to either a single stranded polydeoxynucleotide or two 
complementary polydeoxynucleotide strands which may be chemically synthesized.
Such synthetic oligonucleotides have no 5' phosphate and thus will not ligate to another 
oligonucleotide without adding a phosphate with an ATP in the presence of a kinase. A 
synthetic oligonucleotide will ligate to a fragment that has not been dephosphorylated. 

"Ligation" refers to the process of forming phosphodiester bonds between two double 
stranded nucleic acid fragments (Maniatis, T., et al., Id., p. 146). Unless otherwise
provided, ligation may be accomplished using known buffers and conditions with 10
units of T4 DNA ligase ("ligase") per 0.5 µg of approximately equimolar amounts of the
DNA fragments to be ligated. 

Unless otherwise stated, transformation was performed as described in the method of 
Graham, F. and Van der Eb, A., Virology, 52:456-457 (1973).

Example 1 

Bacterial Expression and Purification of Glycosidase Enzymes 

DNA encoding the enzymes of the present invention, SEQ ID NOS: 1-14 and 57-60, was
initially amplified from a pBluescript vector containing the DNA by the PCR technique
using the primers noted herein. The amplified sequences were then inserted into the
respective pQE vector listed beneath the primer sequences, and the enzyme was
expressed according to the protocols set forth herein. The 5' and 3' primer sequences for 
the respective genes are as follows: 

Thermococcus AEDII12RA - 18B/G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGGTGAATGCTATGATTGTC 3' (SEQ ID NO:29)
3' CGGAAGATCTTCATAGCTCCGGAAGCCCATA 5' (SEQ ID NO:30)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3'
BglII.

OC1/4V - 33B/G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGAAGGTCCGATTTTCC 3' (SEQ ID NO:31)
3' CGGAAGATCTTTAAGATTTTAGAAATTCCTT 5' (SEQ ID NO:32)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3'
BglII.

Thermococcus 9N2 - 31B/G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGGCTTTCTC 3' (SEQ ID NO:33)
3' CGGAGGTACCTCACCCAAGTCCGAACTTCTC 5' (SEQ ID NO:34)

Vector: pQE30; and contains the following restriction enzyme sites 5' EcoRI and 3'
KpnI.

Staphylothermus marinus F1 - 12G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGGTTTCCTGATTAT 3' (SEQ ID NO:35)
3' CGGAAGATCTTTATTCGAGGTTCTTTAATCC 5' (SEQ ID NO:36)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3'
BglII.

Thermococcus chitonophagus GC74 - 22G

5' CCGAGAATTCATTCATTAAAGAGGAGAAATTAACTATGCTTCCAGGAGAACTTTCTC 3' (SEQ ID NO:37)
3' CGGAGGATCCCTACCCCTCCTCTAAGATCTC 5' (SEQ ID NO:38)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3'
BamHI.

M11TL

5' AATAATCTAGAGCATGCAATTCCCCAAAGACTTCATGATAG 3' (SEQ ID NO:39)
3' AATAAAAGCTTACTGGATCAGTGTAAGATGCT 5' (SEQ ID NO:40)

Vector: pQE70; and contains the following restriction enzyme sites 5' SphI and 3'
HindIII.

Thermotoga maritima MSB8 - 6G

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGGAAAGGATCGATGAAATT 3' (SEQ ID NO:41)
3' CGGAGGTACCTCATGGTTTGAATCTCTTCTC 5' (SEQ ID NO:42)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3'
KpnI.

Pyrococcus furiosus VC1 - 7G1

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGTTCCCTGAAAAGTTCCTT 3' (SEQ ID NO:43)
3' CGGAGGTACCTCATCCCCTCAGCAATTCCTC 5' (SEQ ID NO:44)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3'
KpnI.

Bankia gouldi endoglucanase (37GP1)

5' AATAAGGATCCGTTTAGCGACGCTCGC 3' (SEQ ID NO:45)
3' AATAAAAGCTTCCGGGTTGTACAGCGGTAATAGGC 5' (SEQ ID NO:46)

Vector: pQE52; and contains the following restriction enzyme sites 5' BamHI and 3'
HindIII.

Thermotoga maritima α-galactosidase (6GC2)

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGATCTGTGTGGAAATATTCGGAAAG 3' (SEQ ID NO:47)
3' TCTATAAAGCTTTCATTCTCTCTCACCCTCTTCGTAGAAG 5' (SEQ ID NO:48)

Vector: pQET; and contains the following restriction enzyme sites 5' EcoRI and 3'
HindIII.

Thermotoga maritima β-mannanase (6GP2)

5' TTTATTCAATTGATTAAAGAGGAGAAATTAACTATGGGGATTGGTGGCGACGAC 3' (SEQ ID NO:49)
3' TTTATTAAGCTTATCTTTTCATATTCACATACCTCC 5' (SEQ ID NO:50)

Vector: pQEt; and contains the following restriction enzyme sites 5' HindIII and 3'
EcoRI.

AEPII 1a β-mannanase (63GB1)

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGAGTTCCTATGGGGC 3' (SEQ ID NO:51)
3' TTTATTAAGCTTCTCATCAACGGCTATGGTCTTCATTTC 5' (SEQ ID NO:52)

Vector: pQEt; and contains the following restriction enzyme sites 5' HindIII and 3'
EcoRI.

OC1/4V endoglucanase (33GP1)

5' AAAAAACAATTGAATTCATTAAAGAGGAGAAATTAACTATGGTAGAAAGACACTTCAGATATGTTCTT 3' (SEQ ID NO:53)
3' TTTTTCGGATCCAATTCTTCATTTACTCTTTGCCTG 5' (SEQ ID NO:54)

Vector: pQEt; and contains the following restriction enzyme sites 5' BamHI and 3'
EcoRI.

Thermotoga maritima pullulanase (6GP3)

5' TTTTGGAATTCATTAAAGAGGAGAAATTAACTATGGAACTGATCATAGAAGGTTAC 3' (SEQ ID NO:55)
3' ATAAGAAGCTTTTCACTCTCTGTACAGAACGTACGC 5' (SEQ ID NO:56)

Vector: pQEt; and contains the following restriction enzyme sites 5' EcoRI and 3'
HindIII.

The restriction enzyme sites indicated correspond to the restriction enzyme sites on the
bacterial expression vector indicated for the respective gene (Qiagen, Inc., Chatsworth,
CA). The pQE vector encodes antibiotic resistance (Amp^r), a bacterial origin of
replication (ori), an IPTG-regulatable promoter operator (P/O), a ribosome binding site
(RBS), a 6-His tag and restriction enzyme sites.
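
As an illustrative check of how the primers carry the cloning sites named for each vector, the short sketch below searches the first primer pair (SEQ ID NO:29 and SEQ ID NO:30) for the stated recognition sequences. It is a sketch only: the recognition sequences are the standard ones for these enzymes, the helper name `find_site` is hypothetical, and each primer string is searched exactly as printed, without regard to the 5'/3' labels.

```python
# Standard recognition sequences for the enzymes named in the primer list.
SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SphI": "GCATGC",
}


def find_site(primer, enzyme):
    """Return the 0-based index of the enzyme's recognition site, or -1 if absent."""
    return primer.upper().find(SITES[enzyme])


# SEQ ID NO:29 and SEQ ID NO:30 as printed in the text.
fwd = "CCGAGAATTCATTAAAGAGGAGAAATTAACTATGGTGAATGCTATGATTGTC"
rev = "CGGAAGATCTTCATAGCTCCGGAAGCCCATA"

eco_pos = find_site(fwd, "EcoRI")   # site listed as 5' for pQE12
bgl_pos = find_site(rev, "BglII")   # site listed as 3' for pQE12
atg_pos = fwd.find("ATG")           # first ATG in the 5' primer

print(eco_pos, bgl_pos, atg_pos)
```

Both sites sit a few bases in from the primer end, leaving flanking bases beside the recognition sequence, and the first ATG of the 5' primer lies downstream of the EcoRI site.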

The pQE vector was digested with the restriction enzymes indicated. The amplified
sequences were ligated into the respective pQE vector and inserted in frame with the
sequence encoding for the RBS. The ligation mixture was then used to transform the E.
coli strain M15/pREP4 (Qiagen, Inc.) by electroporation. M15/pREP4 contains multiple
copies of the plasmid pREP4, which expresses the lacI repressor and also confers
kanamycin resistance (Kan^r). Transformants were identified by their ability to grow on
LB plates, and ampicillin/kanamycin resistant colonies were selected. Plasmid DNA was
isolated and confirmed by restriction analysis. Clones containing the desired constructs
were grown overnight (O/N) in liquid culture in LB media supplemented with both Amp
(100 µg/ml) and Kan (25 µg/ml). The O/N culture was used to inoculate a large culture
at a ratio of 1:100 to 1:250. The cells were grown to an optical density at 600 nm (O.D.600) of
between 0.4 and 0.6. IPTG (isopropyl-β-D-thiogalactopyranoside) was then added
to a final concentration of 1 mM. IPTG induces by inactivating the lacI repressor,
clearing the P/O and leading to increased gene expression. Cells were grown an extra 3 to
4 hours. Cells were then harvested by centrifugation.
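
The inoculation ratio and the IPTG addition reduce to simple volume arithmetic. The sketch below is illustrative only; the 1 L culture volume and the 1 M IPTG stock are assumptions not stated in the text, and the small volume added by the inoculum and the IPTG is ignored.

```python
def inoculum_range_ml(culture_ml, low_ratio=100, high_ratio=250):
    """Overnight-culture volume (ml) for inoculation at 1:250 up to 1:100."""
    return culture_ml / high_ratio, culture_ml / low_ratio


def iptg_stock_ml(culture_ml, final_mM=1.0, stock_mM=1000.0):
    """Volume of IPTG stock giving the final concentration (C1*V1 = C2*V2)."""
    return culture_ml * final_mM / stock_mM


culture_ml = 1000.0  # hypothetical 1 L expression culture
min_inoc_ml, max_inoc_ml = inoculum_range_ml(culture_ml)
iptg_ml = iptg_stock_ml(culture_ml)  # assumes a 1 M (1000 mM) IPTG stock

print(f"inoculum {min_inoc_ml}-{max_inoc_ml} ml, IPTG stock {iptg_ml} ml")
```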

The primer sequences set out above may also be employed to isolate the target gene from
the deposited material by hybridization techniques described above.

Example 2

Isolation of a Selected Clone from the Deposited Genomic Clones

A clone is isolated directly by screening the deposited material using the
oligonucleotide primers set forth in Example 1 for the particular gene desired to be
isolated. The specific oligonucleotides are synthesized using an Applied Biosystems
DNA synthesizer. The oligonucleotides are labeled with 32P-γ-ATP using T4
polynucleotide kinase and purified according to a standard protocol (Maniatis et al.,
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor,
NY, 1982). The deposited clones in the pBluescript vectors may be employed to
transform bacterial hosts, which are then plated on 1.5% agar plates to a density of
20,000-50,000 pfu/150 mm plate. These plates are screened using nylon membranes
according to the standard screening protocol (Stratagene, 1993). Specifically, the
nylon membrane with denatured and fixed DNA is prehybridized in 6 x SSC, 20 mM
NaH2PO4, 0.4% SDS, 5 x Denhardt's, 500 µg/ml denatured, sonicated salmon sperm
DNA; and 6 x SSC, 0.1% SDS. After one hour of prehybridization, the membrane is
hybridized with hybridization buffer (6 x SSC, 20 mM NaH2PO4, 0.4% SDS, 500 µg/ml
denatured, sonicated salmon sperm DNA) with 1 x 10^6 cpm/ml 32P-probe overnight at
42 °C. The membrane is washed at 45-50 °C with washing buffer (6 x SSC, 0.1% SDS)
for 20-30 minutes, dried and exposed to Kodak X-ray film overnight. Positive clones
are isolated and purified by secondary and tertiary screening. The purified clone is
sequenced to verify its identity to the primer sequence.

Once the clone is isolated, the two oligonucleotide primers corresponding to the gene
of interest are used to amplify the gene from the deposited material. A polymerase
chain reaction is carried out in 25 µl of reaction mixture with 0.5 µg of the DNA of
the gene of interest. The reaction mixture is 1.5-5 mM MgCl2, 0.01% (w/v) gelatin,
20 µM each of dATP, dCTP, dGTP, dTTP, 25 pmol of each primer and 0.25 Unit of
Taq polymerase. Thirty-five cycles of PCR (denaturation at 94 °C for 1 min;
annealing at 55 °C for 1 min; elongation at 72 °C for 1 min) are performed with the
Perkin-Elmer Cetus automated thermal cycler. The amplified product is analyzed by
agarose gel electrophoresis and the DNA band with expected molecular weight is
excised and purified. The PCR product is verified to be the gene of interest by
subcloning and sequencing the DNA product. The ends of the newly purified genes
are nucleotide sequenced to identify full length sequences. Complete sequencing of
full length genes is then performed by Exonuclease III digestion or primer walking.
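
Two quantities implied by the PCR conditions above can be derived directly: the final primer concentration and the total programmed hold time. The sketch below is a worked calculation under the stated conditions; it ignores ramp times between temperatures, which depend on the instrument and are not given in the text.

```python
reaction_ul = 25.0   # reaction volume from the text
primer_pmol = 25.0   # amount of each primer from the text

# pmol per µl is numerically equal to µmol per litre, i.e. µM.
primer_uM = primer_pmol / reaction_ul

cycles = 35
# (step name, temperature in °C, hold time in seconds), as stated in the text
steps = [("denaturation", 94, 60), ("annealing", 55, 60), ("elongation", 72, 60)]

seconds_per_cycle = sum(hold for _, _, hold in steps)
total_hold_min = cycles * seconds_per_cycle / 60

print(f"primer {primer_uM} uM each, {total_hold_min} min of programmed holds")
```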

Example 3 
Screening for Galactosidase Activity 

α-Galactosidase protein activity may be screened for as follows:

Substrate plates were provided by a standard plating procedure. Dilute
XL1-Blue MRF' E. coli host (Stratagene Cloning Systems, La Jolla, CA) to O.D.600
= 1.0 with NZY media. In 15 ml tubes, inoculate 200 µl diluted host cells with phage.
Mix gently and incubate tubes at 37 °C for 15 min. Add approximately 3.5 ml LB top
agarose (0.7%) containing 1 mM IPTG to each tube and pour onto an NZY plate
surface. Allow to cool and incubate at 37 °C overnight. The assay plates are
obtained as substrate p-nitrophenyl α-galactoside (Sigma) (200 mg/100 ml) (100
mM NaCl, 100 mM potassium phosphate), 1% (w/v) agarose. The plaques are
overlaid with nitrocellulose and incubated at 4 °C for 30 minutes, whereupon the
nitrocellulose is removed and overlaid onto the substrate plates. The substrate
plates are then incubated at 70 °C for 20 minutes.

Example 4 

Screening of Clones for Mannanase Activity 

A solid phase screening assay was utilized as a primary screening method to test
clones for β-mannanase activity.

A culture solution of the Y1090- E. coli host strain (Stratagene Cloning Systems, La
Jolla, CA) was diluted to O.D.600 = 1.0 with NZY media. The amplified library from
Thermotoga maritima lambda gt11 library was diluted in SM (phage dilution buffer):
5 x 10^7 pfu/µl diluted 1:1000 then 1:100 to 5 x 10^2 pfu/µl. Then 8 µl of phage
dilution (5 x 10^2 pfu/µl) was plated in 200 µl host cells. They were then incubated in
15 ml tubes at 37 °C for 15 minutes.
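
The serial dilution above can be checked with a few lines of arithmetic. This is a minimal sketch using only the figures stated in the text; the function name `dilute` is a hypothetical helper introduced for illustration.

```python
def dilute(titer_pfu_per_ul, fold_steps):
    """Apply successive fold dilutions to a phage titer given in pfu/µl."""
    for fold in fold_steps:
        titer_pfu_per_ul /= fold
    return titer_pfu_per_ul


stock_titer = 5e7                                  # pfu/µl, amplified library
working_titer = dilute(stock_titer, [1000, 100])   # 1:1000 then 1:100
pfu_plated = working_titer * 8                     # 8 µl added to 200 µl host cells

print(working_titer, pfu_plated)
```

The two dilution steps give the stated 5 x 10^2 pfu/µl, so each 8 µl aliquot carries about 4,000 plaque forming units.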

Approximately 4 ml of molten LB top agarose (0.7%) at approximately 52 °C was
added to each tube and the mixture was poured onto the surface of LB agar plates.
The agar plates were then incubated at 37 °C for five hours. The plates were
replicated and induced with 10 mM IPTG-soaked Duralon-UV™ nylon membranes
(Stratagene Cloning Systems, La Jolla, CA) overnight. The nylon membranes and
plates were marked with a needle to keep their orientation and the nylon membranes
were then removed and stored at 4 °C.

An Azo-galactomannan overlay was applied to the LB plates containing the lambda
plaques. The overlay contains 1% agarose, 50 mM potassium-phosphate buffer pH 7,
0.4% Azocarob-galactomannan (Megazyme, Australia). The plates were incubated
at 72 °C. The Azocarob-galactomannan treated plates were observed after 4 hours
then returned to incubation overnight. Putative positives were identified by clearing
zones on the Azocarob-galactomannan plates. Two positive clones were observed.

The nylon membranes referred to above, which correspond to the positive clones,
were retrieved and oriented over the plate, and the portions matching the locations of the
clearing zones for positive clones were cut out. Phage was eluted from the membrane
cut-out portions by soaking the individual portions in 500 µl SM (phage dilution
buffer) and 25 µl CHCl3.

Example 5 

Screening of Clones for Mannosidase Activity 

A solid phase screening assay was utilized as a primary screening method to test
clones for β-mannosidase activity.

A culture solution of the Y1090- E. coli host strain (Stratagene Cloning Systems, La
Jolla, CA) was diluted to O.D.600 = 1.0 with NZY media. The amplified library from
AEPII 1a lambda gt11 library was diluted in SM (phage dilution buffer): 5 x 10^7
pfu/µl diluted 1:1000 then 1:100 to 5 x 10^2 pfu/µl. Then 8 µl of phage dilution
(5 x 10^2 pfu/µl) was plated in 200 µl host cells. They were then incubated in 15 ml
tubes at 37 °C for 15 minutes.

Approximately 4 ml of molten LB top agarose (0.7%) at approximately 52 °C was
added to each tube and the mixture was poured onto the surface of LB agar plates.
The agar plates were then incubated at 37 °C for five hours. The plates were
replicated and induced with 10 mM IPTG-soaked Duralon-UV™ nylon membranes
(Stratagene Cloning Systems, La Jolla, CA) overnight. The nylon membranes and
plates were marked with a needle to keep their orientation and the nylon membranes
were then removed and stored at 4 °C.

A p-nitrophenyl-β-D-mannopyranoside overlay was applied to the LB plates
containing the lambda plaques. The overlay contains 1% agarose, 50 mM
potassium-phosphate buffer pH 7, 0.4% p-nitrophenyl-β-D-mannopyranoside (Megazyme,
Australia). The plates were incubated at 72 °C. The p-nitrophenyl-β-D-mannopyranoside
treated plates were observed after 4 hours then returned to incubation
overnight. Putative positives were identified by clearing zones on the
p-nitrophenyl-β-D-mannopyranoside plates. Two positive clones were observed.

The nylon membranes referred to above, which correspond to the positive clones,
were retrieved and oriented over the plate, and the portions matching the locations of the
clearing zones for positive clones were cut out. Phage was eluted from the membrane
cut-out portions by soaking the individual portions in 500 µl SM (phage dilution
buffer) and 25 µl CHCl3.

Example 6
Screening for Pullulanase Activity

Pullulanase protein activity may be screened for as follows:

Substrate plates were provided by a standard plating procedure. Host cells
are diluted to O.D.600 = 1.0 with NZY or appropriate media. In 15 ml tubes, inoculate
200 µl diluted host cells with phage. Mix gently and incubate tubes at 37 °C for 15
min. Approximately 3.5 ml LB top agarose (0.7%) is added to each tube and the
mixture is plated, allowed to cool, and incubated at 37 °C for about 28 hours.
Overlays of 4.5 ml of the following substrate are poured:

100 ml total volume

0.5 g    Red Pullulan (Megazyme, Australia)
1.0 g    Agarose
5 ml     Buffer (Tris-HCl pH 7.2 @ 75 °C)
2 ml     5 M NaCl
5 ml     CaCl2 (100 mM)
85 ml    dH2O

Plates are cooled at room temperature, and then incubated at 75 °C for 2 hours.
Positives are observed as showing substrate degradation.
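
The final concentrations implied by the overlay recipe can be worked out as follows. This is an illustrative sketch that takes the stated 100 ml as the final volume (the listed liquids sum to 97 ml) and leaves the buffer out because its stock concentration is not given; the helper name `final_mM` is hypothetical.

```python
def final_mM(stock_mM, stock_ml, total_ml):
    """Final concentration after diluting a stock into the total volume."""
    return stock_mM * stock_ml / total_ml


total_ml = 100.0                                 # stated total volume of the overlay mix

nacl_mM = final_mM(5000.0, 2.0, total_ml)        # 2 ml of 5 M NaCl
cacl2_mM = final_mM(100.0, 5.0, total_ml)        # 5 ml of 100 mM CaCl2
agarose_pct = 1.0 / total_ml * 100               # 1.0 g per 100 ml, % (w/v)
red_pullulan_pct = 0.5 / total_ml * 100          # 0.5 g per 100 ml, % (w/v)
overlays_per_batch = int(total_ml // 4.5)        # whole 4.5 ml overlays per batch

print(nacl_mM, cacl2_mM, agarose_pct, red_pullulan_pct, overlays_per_batch)
```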

Example 7 

Screening for Endoglucanase Activity

Endoglucanase protein activity may be screened for as follows:

1. The gene library is plated onto 6 LB/GelRite/0.1% CMC/NZY agar plates
(~4,800 plaque forming units/plate) in E. coli host with LB agarose as top agarose.
The plates are incubated at 37 °C overnight.

2. Plates are chilled at 4 °C for one hour.

3. The plates are overlaid with Duralon membranes (Stratagene) at
room temperature for one hour and the membranes are oriented and lifted off the
plates and stored at 4 °C.

4. The top agarose layer is removed and plates are incubated at 37 °C
for ~3 hours.

5. The plate surface is rinsed with NaCl.

6. The plate is stained with 0.1% Congo Red for 15 minutes.

7. The plate is destained with 1 M NaCl.

8. The putative positives identified on plate are isolated from the
Duralon membrane (positives are identified by clearing zones around clones). The
phage is eluted from the membrane by incubating in 500 µl SM + 25 µl CHCl3.

9. Insert DNA is subcloned into any appropriate cloning vector and
subclones are reassayed for CMCase activity using the following protocol:

i) Spin 1 ml overnight miniprep of clone at maximum speed
for 3 minutes.

ii) Decant the supernatant and use it to fill "wells" that have
been made in an LB/GelRite/0.1% CMC plate.

iii) Incubate at 37 °C for 2 hours.

iv) Stain with 0.1% Congo Red for 15 minutes.

v) Destain with 1 M NaCl for 15 minutes.

vi) Identify positives by clearing zone around clone.

Numerous modifications and variations of the present invention are possible in light
of the above teachings and, therefore, within the scope of the appended claims, the
invention may be practiced otherwise than as particularly described.

WHAT IS CLAIMED IS:

1. An isolated polynucleotide selected from the group consisting of:

(a) SEQ ID NOS: 1-14 and 57-60;

(b) SEQ ID NOS: 1-14 and 57-60, wherein T can also be U;

(c) polynucleotide sequences complementary to SEQ ID NOS: 1-14
and 57-60;

(d) polynucleotide sequences which encode an amino acid sequence as
set forth in SEQ ID NOS: 15-28, and 61-64; and

(e) fragments of (a), (b), (c) or (d) that are at least 15 consecutive
bases in length and that will selectively hybridize to DNA which
encodes a polypeptide of SEQ ID NOS: 15-28, and 61-64.

2. A vector comprising a polynucleotide of claim 1.

3. A host cell containing the vector of claim 2.

4. The method of claim 3, wherein the host cell is a eukaryotic cell.

5. The method of claim 3, wherein the host cell is a prokaryotic cell.

6. A method for producing a polypeptide comprising:

(a) culturing the host cells of claim 3;

(b) expressing from the host cell of claim 3 a polypeptide encoded by
said polynucleotide; and

(c) isolating the polypeptide.

7. An enzyme selected from the group consisting of:

(a) an enzyme comprising an amino acid sequence set forth in SEQ ID
NOS: 15-28 or 61-64; and

(b) an enzyme which comprises at least 30 consecutive amino acid
residues as an enzyme of (a).

8. An enzyme of which at least a portion is coded for by a polynucleotide of
claim 1, and which is selected from the group consisting of:

(a) an enzyme comprising an amino acid sequence which is at least
70% identical to an amino acid sequence selected from the group
of amino acid sequences set forth in SEQ ID NOS: 15-28 or 61-64;
and

(b) an enzyme which comprises at least 30 amino acid residues to the
enzyme of (a).

9. A method for generating glucose from soluble cell oligosaccharides
comprising contacting a sample containing oligosaccharides with an
effective amount of an enzyme selected from the group consisting of an
enzyme having the amino acid sequence set forth in SEQ ID NOS: 15-28,
61-63 and 64 such that glucose is produced.

10. The method of claim 9, wherein the sample is selected from the group
consisting of dairy products, fruit juices, detergents, textiles, guar gum,
animal feed, plant biomass and waste products.

11. The method of claim 9, wherein the oligosaccharide is selected from the
group consisting of maltose, cellobiose, lactose, sucrose, raffinose,
stachyose, verbascose, cellulose, starch, amylose, glycogen, disaccharides,
polysaccharides and pullulan.

Thermotoga maritima β-mannanase (6GP2)

9 18 27 36 45 54
5' ATG GGG ATT GGT GGC GAC GAC TCC TGC AGC CCG TCA CTA TCG CCG GAA TTC CTT
Met Gly Ile Gly Gly Asp Asp Ser Trp Ser Pro Ser Val Ser Ala Glu Phe Leu

63 72 81 90 99 108
TTA TTG ATC GTT GAG CTC TCT TTC GTT CTC TTT GCA ACT GAC GAG TTC CTG AAA
Leu Leu Ile Val Glu Leu Ser Phe Val Leu Phe Ala Ser Asp Glu Phe Val Lys

117 126 135 144 153 162
GTG GAA AAC GGA AAA TTC GCT CTG AAC GGA AAA GAA TTC AGA TTC ATT GGA AGC
Val Glu Asn Gly Lys Phe Ala Leu Asn Gly Lys Glu Phe Arg Phe Ile Gly Ser

171 180 189 198 207 216
AAC AAC TAC TAC ATG CAC TAC AAG AGC AAC GGA ATG ATA GAC ACT GTT CTG GAG
Asn Asn Tyr Tyr Met His Tyr Lys Ser Asn Gly Met Ile Asp Ser Val Leu Glu

225 234 243 252 261 270
AGT GCC AGA GAC ATG GGT ATA AAG GTC CTC AGA ATC TGG GGT TTC CTC GAC GGG
Ser Ala Arg Asp Met Gly Ile Lys Val Leu Arg Ile Trp Gly Phe Leu Asp Gly

279 288 297 306 315 324
GAG AGT TAC TGC AGA GAC AAG AAC ACC TAC ATG CAT CCT GAG CCC GGT GTT TTC
Glu Ser Tyr Cys Arg Asp Lys Asn Thr Tyr Met His Pro Glu Pro Gly Val Phe

333 342 351 360 369 378
GGG GTG CCA GAA GGA ATA TCG AAC GCC CAG AGC GGT TTC GAA AGA CTC GAC TAC
Gly Val Pro Glu Gly Ile Ser Asn Ala Gln Ser Gly Phe Glu Arg Leu Asp Tyr

387 396 405 414 423 432
ACA GTT GCG AAA GCG AAA GAA CTC GGT ATA AAA CTT GTC ATT GTT CTT GTG AAC
Thr Val Ala Lys Ala Lys Glu Leu Gly Ile Lys Leu Val Ile Val Leu Val Asn

441 450 459 468 477 486
AAC TGG GAC GAC TTC GGT GGA ATG AAC CAG TAC GTG AGG TGG TTT GGA GGA ACC
Asn Trp Asp Asp Phe Gly Gly Met Asn Gln Tyr Val Arg Trp Phe Gly Gly Thr

495 504 513 522 531 540
CAT CAC GAC GAT TTC TAC AGA GAT GAG AAG ATC AAA GAA GAG TAC AAA AAG TAC
His His Asp Asp Phe Tyr Arg Asp Glu Lys Ile Lys Glu Glu Tyr Lys Lys Tyr

Figure 11a.

Thermotoga maritima β-mannanase (6GP2) (continued)

549 558 567 576 585 594
GTC TCC TTT CTC GTA AAC CAT GTC AAT ACC TAC ACG GGA GTT CCT TAC AGG GAA
Val Ser Phe Leu Val Asn His Val Asn Thr Tyr Thr Gly Val Pro Tyr Arg Glu

603 612 621 630 639 648
GAG CCC ACC ATC ATG GCC TCG GAG CTT GCA AAC GAA CCG CCC TGT GAG ACG GAC
Glu Pro Thr Ile Met Ala Trp Glu Leu Ala Asn Glu Pro Arg Cys Glu Thr Asp

657 666 675 684 693 702
AAA TCG GGG AAC ACG CTC CTT GAG TCG GTG AAG GAG ATC AGC TCC TAC ATA AAG
Lys Ser Gly Asn Thr Leu Val Glu Trp Val Lys Glu Met Ser Ser Tyr Ile Lys

711 720 729 738 747 756
ACT CTG GAT CCC AAC CAC CTC GTG GCT GTG GGG GAC GAA GGA TTC TTC AGC AAC
Ser Leu Asp Pro Asn His Leu Val Ala Val Gly Asp Glu Gly Phe Phe Ser Asn

765 774 783 792 801 810
TAC GAA GGA TTC AAA CCT TAC GGT GGA GAA GCC GAG TGG GCC TAC AAC GGC TGG
Tyr Glu Gly Phe Lys Pro Tyr Gly Gly Glu Ala Glu Trp Ala Tyr Asn Gly Trp

819 828 837 846 855 864
TCC GGT GTT GAC TGG AAG AAG CTC CTT TCG ATA GAG ACG GTG GAC TTC GGC ACG
Ser Gly Val Asp Trp Lys Lys Leu Leu Ser Ile Glu Thr Val Asp Phe Gly Thr

873 882 891 900 909 918
TTC CAC CTC TAT CCG TCC CAC TGG GGT GTC AGT CCA GAG AAC TAT GCC CAG TGG
Phe His Leu Tyr Pro Ser His Trp Gly Val Ser Pro Glu Asn Tyr Ala Gln Trp

927 936 945 954 963 972
GGA GCG AAG TGG ATA GAA GAC CAC ATA AAG ATC GCA AAA GAG ATC GGA AAA CCC
Gly Ala Lys Trp Ile Glu Asp His Ile Lys Ile Ala Lys Glu Ile Gly Lys Pro

981 990 999 1008 1017 1026
GTT GTT CTG GAA GAA TAT GGA ATT CCA AAG AGT GCG CCA GTT AAC AGA ACG GCC
Val Val Leu Glu Glu Tyr Gly Ile Pro Lys Ser Ala Pro Val Asn Arg Thr Ala

1035 1044 1053 1062 1071 1080
ATC TAC AGA CTC TGG AAC GAT CTG GTC TAC GAT CTC GGT GGA GAT GGA GCG ATG
Ile Tyr Arg Leu Trp Asn Asp Leu Val Tyr Asp Leu Gly Gly Asp Gly Ala Met

Figure 11b (Continued)

AEPII 1a β-mannosidase (63GB1) (continued)

549 558 567 576 5B5 594 

-!? ™ !!! ?H ^ m CCC MG TAT 001 007 TAC AT C GCC CAT GCG CTC GGA 
Arg Thr Val Val Glu Phe Ala Lys Tyr Ala Ala Tyr HI Ma His III Leu Gly 

603 612 621 630 639 64B 

GAC CTC GTG GAC ACA TGG AGC ACC TTC AAC GAA CCT ATG GTA GTT GTG GAG CTC 

Asp Leu Val Asp Thr Trp Ser Thr Ph. Asn Glu Pro Het Val Val Val Glu L^u 

657 666 675 684 693 702 

GGC TAC CTC GCC CCC TAC TCA GGA TTT CCC CCG GGA GTC ATG AAC CCC GAG GCC 

Gly Tyr Leu Ala Pro Tyr Ser Gly Phe Pro Pro Gly Val Met Asn Pro Glu Ala 

711 720 729 738 747 755 

GCG AAG CTG GCG ATC CTC AAC ATG ATA AAC CCC CAC GCC TTG GCA TAT AAG ATG 

Ala Lys Leu Ala Ila Leu Asn Mae lit Asn Ala His Ala Leu Ala Tyr Lys Met 

765 774 7 " 792 801 810 

ATA AAG AGG TTC GAC ACC AAG AAG GCC GAT GAG GAT AGC AAG TCC CCT GCG GAC 

He Lys Arg Phe Asp Thr Lys Lys Ala Asp Glu Asp Ser Lys Ser Pro All Asp 

«" 828 837 846 855 864 

GTT GGC ATA ATT TAC AAC AAC ATC GGT GTT GCC TAC CCT AAA GAC CCT AAC GAT 

Val Gly ll« He Tyr Asn Asn Ila Gly Val Ala Tyr Pro Lys Asp Pro Asn j*p 

873 882 891 900 909 918 

CCC AAG GAC GTT AAA GCA GCC GAA AAC GAC AAC TAC TTC CAC ACC GGA CTG TTC 

Pro Lys Asp Val Lys Ala Ala Glu Asn Asp Asn Tyr Phe His Ser Gly L^u Phe 

927 536 945 954 963 972 

TTT GAT GCC ATC CAC AAG GGT AAG CTC AAC ATA GAG TTC GAC GGC GAA AAC TTT 

Phe Asp Ala He His Lys Gly Lys Leu Asn Ila Glu Phe Asp Gly Glu Asn Phe 

981 990 9*5 100B 1017 1026 

GTA AAA GTT AGA CAC CTA AAA GGC AAT GAC TGG ATA GGC CTC AAC TAC TAC ACC 

Val Lys Val Arg His Leu Lys Gly Asn Asp Trp He Gly Leu Asn Tyr T^r Th^ 

1035 1044 1053 1062 1071 1080 

CGC GAC GTT GTT AGA TAT TCC GAG CCC AAG TTC CCA AGT ATA CCC CTC ATA TCC 

Arg Glu Val Val Arg Tyr Ser Glu Pro Lys Phe Pro Ser He Pro Leu Ila Ser 

Figure 12b (Continued)

Figure 14b (Continued)

Figure 15b (continued)

Figure 15c (continued)

TGA 1991
END

Figure 15d (continued)

Figure No. 16 Thermotoga maritima MSB8

Figure 16b (continued)

2041 TGT GAG TTT GGT TGA 2055
681 Cys Glu Phe Gly End 685

Figure 16c (continued)

Figure No. 17 Bankia gouldi (37gp4)

Figure 17b (continued)



WO 98/24799 



43/46 



PCT/US97/22623 



Figure 17c (continued)

Figure 17d (continued)
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Figure No. ia* Pyococcus furiosus VC1(7EG1) 
leader sequence: amino acids 1-24 



9 18 



5 1 



ATG AGC AAG AAA AAG TTC GTC ATr rll m " 45 54 

«e, Ser Ly S Lya Lys Phe C I "I V G ^ Z £ ™ - CTT TTA GTA CAG 

Val Ser Ile Leu Thr Ile Leu Leu Val Gln



90 99
Y Tyr His Thr Ser Glu Asp Lys Ser Thr Ser Asn



117 



135 144 153

ACC TCA TCT ACA CCA CCC CAA ACA ACA CTT TCC ACT Am

- ~ ~ », » ,„ M „ ttr « ~ e s 



171 180 189 198 



AGA TAC CCT GAT GAC GGT GAG TGG CCA GGA GCT CCT ATT GAT
Arg Tyr Pro Asp Asp Gly Glu Trp Pro Gly

Gly Ala Pro Ile Asp Lys Asp Gly Asp



225 234 243 252



zzz2z™ s s r r ™ - - « - « « « 

Tyr Ile Glu Ile Asn Leu Trp Asn Ile Leu Asn Ala Thr
279 288 297 306

z z 2 z z s £ r r ? - ~ •* 

Thr Tyr Asn Leu Thr Ser Gly Val Leu His Tyr Val Gln



333 342 351 360



- z a:: e ni :ii r r - aga agt - : - - - « - 

P Asn Ile Val Leu Arg Asp Arg Ser Asn Trp Val His Gly Tyr Pro

387 396 405 414

CAA ATA TTC TAT GGA AAC AAG CCA TGG AAT GCA AAC TAC GCA ACT GAT GGC CCA

Glu Ile Phe Tyr Gly Asn Lys Pro Trp Asn Ala Asn Tyr Ala Thr Asp Gly Pro

441 450 459 468

ATA CCA TTA CCC ACT AAA GTT TCA AAC CTA n ^ 

Ile Pro Leu Pro Ser Lys T * C TTC ™ CTA ACA A *C TCC

Ser Lys Val Ser Asn Leu Thr Asp Phe Tyr Leu Thr Ile Ser
495 504 513

TAT AAA CTT GAG CCC AAG AAC GGC CTT. r n ,™ *" 5 *° 

549 558 567 

«A ACG AGA GAA GCT TGG AGA ACA ACA GGA , w 594 
- Thr Arg Glu Ala Trp Arg ^ 2 « ^ - AGC CAT GAG CAA GAA GTA

Y Ile Asn Ser Asp Glu Gln Glu Val

603 S " S21 

ATG ATA TGG ATT TAC TAT GAC GGA TTA CAA CCG GCT rm 

« - », n. ^ „ ,„ 01y _ ~ « « « ~ 

666 675

ATT GTA GTC CCA ATA ATA GTT AAC GGA »n. ~ " 3 702 

TGG AAG GCA AAC ATT GGT TGG GAG TAT GTT GCA ~ ^ 756 

765 774 783 792

AAA GAG GGA ACA GTG ACA ATT m» „»„ „ 601 810 

G1U Gl y Thr val £ £ p CCA ™ GCA «* AGT GTT GCA GCC AAC 

Thr Ile Pro Tyr Gly Ala Phe Ile Ser

819 828 837

ATT TCA AGC TTA CCA AAT TAC apa na* ~ 855 864 

873 882 891 

ACT GAG TTT GGA ACG CCA AGC ACT Arc -rnn „ ^ 9 ° 9 918 

* «. ». «, * ,„ zzzz ~ ~ r 0,0 ro tk irc - 

Ala His Leu Glu Trp Trp Ile Thr

927 936 945

MC ATA ACA CTA ACT CCT CTA GAT AGA CCT CTT m TCC ™ 

^ - - - MP Ar 9 Pro £ £ £ .. 



Figure 18b (continued)